the PCR products of the tumor/non-tumor DNA in the
remaining 39 patients studied were the same. No
correlation of the CAG repeat length to the aggressiveness
or mortality of prostate cancer has been suggested.

SUMMARY OF THE INVENTION

The present invention is based upon the discovery that
the number of CAG repeats in the androgen receptor
determines the aggressiveness of prostate cancer and the
likelihood that a patient of at least about 60 years of age
will die of the disease. For total prostate cancer, a
slight inverse association between androgen receptor CAG
repeat length and risk of disease was observed, but this
was not statistically significant. However, CAG repeat
length was inversely associated with cancers characterized
as "aggressive" (extraprostatic extension (stage C or D)
and/or high grade). For an increment of six CAG repeats,
equivalent to the difference between the median CAG length
in the upper versus lower tertile of CAG repeats, the
relative risk of "aggressive" prostate cancer was 0.66 (95
percent confidence interval, 0.44-0.96; p = 0.03) and the
relative risk for developing distant metastatic prostate
cancer was 0.41 (95 percent confidence interval, 0.21-0.51;
p = 0.01). CAG repeat length was not associated with non-
aggressive disease. Results presented herein demonstrate
an inverse correlation between CAG repeat length and
indicators of disease progression (p, trend, = 0.005).
Risk of advanced, aggressive, or fatal disease was
particularly strongly related to CAG length among older
men.

The results herein also provide evidence that the
variability in the androgen receptor CAG microsatellite
influences the risk of developing "aggressive" prostate
cancer. As a result, a method of predicting the onset of
METHOD OF PREDICTING THE RISK OF PROSTATE CANCER
MORBIDITY AND MORTALITY

BACKGROUND OF THE INVENTION

Prostate cancer is the most common malignant tumor
and the second most common cause of cancer death in
American males. Schoenberg et al., Biochemical and
Biophysical Research Communications, Vol 198, No 1, pp 74-
80 (1994). Racial differences in the incidence of disease
have also been observed, with the highest incidence in the
African-American population, followed by Caucasians. The
incidence of the disease is lowest in Asians.
Interestingly, the androgen receptor gene contains a highly
polymorphic CAG microsatellite in exon 1, resulting in a
variable length glutamine repeat. The mean CAG repeat
lengths observed in African-Americans, Caucasians and
Asians are 18, 21 and 22, respectively. While the androgen
receptor gene has been speculated to possess some
relationship with prostate cancer, the nature of that
relationship is unknown and the subject of speculation.
Coetzee and Ross, Journal of the National Cancer Institute,
Vol. 86, No. 11 (1994).

The human androgen receptor gene has been assigned
chromosomal location Xq11-12 with the polymorphic CAG
repeat region located at position 172 following the
translation start codon. The polymorphism in the human
androgen receptor gene has been used to diagnose families
with the Androgen Insensitivity Syndromes, employing the
polymerase chain reaction (PCR). Sleddens et al., Nucleic
Acids Research, Vol 20, No 6, p. 1427.

The relationship of the CAG repeat of the androgen
receptor (AR) gene and prostate cancer has been studied.
Schoenberg et al., supra, describe a somatic contraction of
the repeat region in one patient with prostate cancer, yet
androgen insensitivity, including hypogonadism, reduced
fertility with oligospermia or azoospermia, and
gynecomastia despite normal serum testosterone levels in
men (LaSpada, A.R., et al., Nature, 352:77-9 (1991);
Arbizu, T., et al., J. Neurol. Sci., 59:371-82 (1983);
Igarashi, S., et al., Neurology, 42:2300-2 (1992)).

Because of their role in prostate cell division,
androgens are believed to influence the initiation or
promotion of prostate cancer (Ross, R.K., et al., Cancer,
75:1778-1782 (1995)). Moreover, the variation in androgen
receptor transactivation related to polymorphism in CAG
repeat length could influence occurrence or progression of
prostate cancer. Coetzee and Ross have hypothesized that
the generally shorter CAG repeat lengths in the AR among
African-Americans may contribute to their high incidence of
prostate cancer, particularly advanced cancer (Coetzee,
G.A., Ross, R.K., J. Natl. Cancer Inst., 86:872-3 (1994)).
A slight inverse association between CAG repeat length and
risk of prostate cancer has been reported, but this finding
was based on only 47 cases and was not statistically
significant (Irvine, R.A., et al., Cancer Res., 55:1937-40
(1995)). Hence, the relationship between polymorphism in
CAG repeat length in the AR and prostate cancer development
and progression was examined in a large cohort study, the
Physicians' Health Study.

As set forth above, the invention relates to a method
for prognosis of prostate cancer in a male comprising: (a)
determining the length of the CAG trinucleotide repeat of
exon 1 of the androgenic receptor gene and/or the length of
the TA dinucleotide repeat in the 5-alpha reductase gene
obtained from DNA of the male and (b) correlating the
length of the repeat with the aggressiveness and mortality
risk of the cancer in the male.

The invention also relates to a method for determining
length of a CAG trinucleotide repeat in exon 1 of the
aggressive prostate cancer and the risk of mortality from
the prostate cancer is available.

The present invention additionally relates to the
discovery that the length of the TA repeat polymorphism in
the 5-alpha reductase gene is directly related with risk of
aggressive disease. Thus, the invention relates to a
method for prognosis of prostate cancer in a male
comprising: (a) determining the length of the TA
dinucleotide repeat in the 5-alpha reductase gene and (b)
correlating the length of the repeat with the risk of
prostate cancer in the male.

DETAILED DESCRIPTION OF THE INVENTION 

Cell division in the prostate gland is controlled by
testosterone (Coffey D.S., UICC Technical Report Series,
43:4-22, Geneva: International Union Against Cancer,
(1979)). In the prostate cell, testosterone is converted
to dihydrotestosterone (DHT) through the action of 5-alpha-
reductase (Thigpen, A.E., et al., N.E. J. Med., 327:1216-19
(1992)). DHT binds with the androgen receptor (AR) in the
cell nucleus, and the DHT-AR complex interacts with
specific DNA sequences, resulting in up- or down-regulation
of target genes. Encoded in exon 1 of the AR gene on the
X-chromosome are polymorphic CAG microsatellites. The CAG
repeats, which range normally from about 8 to 31 repeats
and average about 20 (Edwards A., et al., Genomics 12:241-
53 (1992)), encode polyglutamine chains in the
transactivation region of the AR. In transfection assays,
the lengths of these polyglutamine chains correlate
inversely with transactivation of the AR (Chamberlain,
N.L., et al., Nucleic Acids Res., 22:3181-5 (1994); Kazemi-
Esfarjani P., et al., Human Molecular Genetics, 4:523-7
(1995)). Expansion of the CAG microsatellite to 40 to 62
repeats, which causes X-linked spinal and bulbar muscular
atrophy (Kennedy's disease), leads to signs of relative



WO 97/17469 



POVUS96/1778S 



the prostate. A TA dinucleotide repeat polymorphism exists
in the 3' untranslated region of the 5 alpha reductase,
Type II, gene. The 5 alpha reductase alleles with longer
TA repeats are more common in African-Americans, the group
with the highest incidence of CaP. While investigators
speculated that the length of the TA repeat region of the 5
alpha reductase gene in the germline of males was inversely
related to the later incidence of prostate cancer or its
morbidity, the results reported below support the opposite
conclusion.

The 5 alpha reductase converts testosterone to
dihydrotestosterone (DHT), the most potent natural ligand
of the androgen receptor. Two isozymes of 5 alpha
reductase exist (Jenkins, E.P., et al., J. Clin. Invest.,
89:293-300 (1992)). The 5 alpha reductase, Type I, has its
gene on chromosome 5 and codes for a protein which is
expressed in the liver, skin, and scalp (Jenkins, E.P., et
al., Genomics, 11:1102-1112 (1991); Thigpen, A.E., et al.,
J. Clin. Invest., 92:903-910 (1993)). There is no known
phenotype for mutations of this first isozyme (Thigpen,
A.E., et al., J. Clin. Invest., 92:903-910 (1993)). The 5
alpha reductase, Type II (SRD5A2), has its gene on
chromosome 2 and is required for the development of the
male external genitalia and growth of the prostate (Wilson,
J.D., Ann. Rev. Phys., 40:279-306 (1978)). Deficiency of 5
alpha reductase, Type II, activity leads to a phenotype
known as pseudohermaphroditism (Thigpen, A.E., et al., J.
Clin. Invest., 90:799-809 (1992)). Affected boys have
ambiguous external genitalia and a rudimentary prostate
(Wilson, J.D., Ann. Rev. Phys., 40:279-306 (1978);
Andersson, S., et al., Nature, 354:159-161 (1991)). In
older men, 5 alpha reductase activity is present in the
stroma of normal prostate and increased in stroma
associated with benign prostatic hypertrophy (Silver, R.I.,
et al., J. of Urology, 152:433-437 (1994)
androgenic receptor gene and/or the length of the TA
dinucleotide repeat in the 5-alpha reductase gene or its
complement in a male patient having prostate cancer
comprising: (a) obtaining DNA from the patient wherein the
DNA comprises the CAG trinucleotide repeat of exon 1 of the
androgenic receptor gene and/or the length of the TA
dinucleotide repeat in the 5-alpha reductase gene or its
complement; (b) determining the length of the repeat; and
(c) comparing the length of the repeat with the length of
the repeat in a significant number of individuals; wherein
the length of the repeat is prognostic of the
aggressiveness and mortality of the prostate cancer.

As detailed above, the length of the AR CAG repeat in
the germline is inversely related to the onset of
aggressive prostate cancer and mortality due to prostate
cancer, particularly in males over about 60 years of age.
The male to be tested can be of any race, including
African-American, Caucasian or Asian. A suitable control or
comparison can be obtained, for example, from males,
including males of all races. Accuracy of the method can
be increased by comparing the length of the CAG repeat in
the male patient with the mean or average values of the
length of the CAG repeat in males of the same race. That
is, an appropriate control for comparing the length of the
repeat as a prognostic can include the mean and/or average
length of the repeat in a population of males of the same
racial background or origin. Of course, random selection
of a significant number of males improves the statistical
significance of the control population.
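As a purely illustrative sketch of such a comparison, the following Python fragment expresses a patient's repeat length relative to the mean and standard deviation of a reference population. The function name and the use of a standardized difference are assumptions made for illustration and are not part of the described method; the reference values in the usage line are the control mean and standard deviation reported in the Results (21.95 and 3.46).

```python
def standardized_difference(repeat_length, reference_mean, reference_sd):
    """Return (patient length - reference mean) / reference SD.

    Negative values mean the patient's repeat is shorter than the
    reference population average. Hypothetical helper for illustration.
    """
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (repeat_length - reference_mean) / reference_sd


# Example: an 18-repeat allele against controls with mean 21.95, SD 3.46.
z = standardized_difference(18, 21.95, 3.46)
```

A reference drawn from males of the same racial background would simply be passed in as a different mean and standard deviation.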

Another embodiment of the invention includes screening
for a TA repeat polymorphism in the 5 alpha reductase gene.
The development and progression of prostate cancer (CaP) is
believed to be influenced by androgen hormones. The 5
alpha reductase, Type II, converts testosterone to
dihydrotestosterone and is critical to the development of
To study the three allele families and their
association with CaP, a case-control study was performed of
368 prevalent cases of men with prostate cancer and 368
matched controls, all participants in the Physicians' Health
Study. The polymorphic nature of this gene and the
relative allele frequencies reported by Reichardt et al.
were confirmed as described below. A statistically
significant, decreased risk of prostate cancer among
patients homozygous for the longer TA allele families, a
truly surprising result, was also demonstrated.

Either DNA or RNA can be used in the present method.
The DNA which can be used in the method can be cDNA or
genomic DNA, preferably genomic DNA. The source of DNA can
be from any cell or cells removed from the individual and
can include cultured progeny thereof. Since the invention
does not rely upon the identification of somatic mutation
in the tumor, but is preferably analyzing germline DNA, the
DNA can be isolated from non-cancerous cells, such as
somatic tissue or a blood sample. Also, because the DNA
which is preferably analyzed is germline DNA, the
prognostic method can be carried out prior to onset of
disease. This significant advantage can be used to
establish a cancer screening schedule prior to onset of
prostate cancer and treatment protocol upon onset due to
the risk factor assigned by the described method.

The AR CAG repeat length or 5-alpha reductase TA
repeat length can be determined using methods generally
known in the art, such as by PCR (described herein below).
Alternatively, the DNA comprising the repeat or its
complement can be sequenced, thereby identifying the repeat
length. In yet another embodiment, the protein encoded by
the DNA can be sequenced or identified, thereby
establishing the length of the repeat. Since CAG encodes
the amino acid glutamine, the identification of the number
of glutamine residues in the corresponding region of the
Because of its role in prostate ontogeny and growth,
alterations in the function of 5 alpha reductase, Type II,
could potentially affect an individual's risk of CaP. Even
small alterations in the function of 5 alpha reductase
could, over a lifetime, decrease levels of intraprostatic
DHT significantly enough to alter the incidence of prostate
cancer.

Different levels of androgen hormones have been
suggested as one possible explanation of the observed
difference in rates of CaP between ethnic groups. Ross and
colleagues measured surrogate markers of 5 alpha reductase
activity in young Japanese, African-American, and Caucasian
men. They found Japanese men, who have the lowest rates of
CaP, to have hormone levels consistent with lower 5 alpha
reductase activity than African-American and Caucasian men
(whose hormone levels were not significantly different from
one another) (Ross, R.K., et al., The Lancet, 339:887-889
(1992)). This indirectly suggested that the activity of
this enzyme may play a role in the low rates of CaP
observed in Japanese men.

SRD5A2 has a polymorphism in its 3' untranslated
region. Russell et al. demonstrated three alleles which
differ in the number of TA dinucleotide repeats, TA(0),
TA(9), and TA(16) (Davis, D.L. and Russell, D.W., Human
Molec. Genetics, 2:820 (1993)). Although there is some
minor variation in the exact number of TA repeats, the
labels adequately describe the three clusters of families
observed (Ross, R.K., et al., Cancer, 75:1778-1782 (1995)).
Recently, Reichardt et al. confirmed that the TA(0) allele
family is most common and the TA(16) allele family is found
almost exclusively in African-American men (Reichardt,
J.K.V., et al., Cancer Res., 55:3973-3975 (1995)). The
hypothesis has been set forth that these longer alleles may
be associated with an increased risk of CaP and may
partially explain the observed racial differences in CaP.
employed. One example of capillary electrophoresis is in a
polymer network consisting of 8%
polyacryloylaminoethoxyethanol in the absence of cross-
linker, and offers a simple procedure for separation and
on-line detection via UV absorbance at 254 nm, thus
avoiding additional staining steps. The capillary column
can be used repeatedly and the electropherogram can be
stored on magnetic support. Comparisons among different
runs can be obtained by aligning all tracings to an internal
standard of a known base pair size added as a marker (Nesi
et al., Electrophoresis, 15:544-6 (1994)).

In yet another embodiment, the number of repeats can
be determined according to the method of Yamamoto et al.
(Biochem. Biophys. Res. Comm., 182:507 (1992)). The DNA
obtained from the male containing the repeat is amplified
by standard PCR, and a primer extension is carried out
following addition of dideoxy ATP to the reaction mixture.
The extension of the end-labeled reverse primer adjacent to
the 3' end of the repeats stops at the first T after the
repeats and the resultant primer products can be analyzed
by denaturing polyacrylamide gel electrophoresis and
autoradiography.

Additional PCR based methods which can be used include
random rapid amplification of cDNA ends (RACE), described
by Carney et al. (Gene, 155:299 (1995)); single strand
conformation polymorphism analysis (Ris-Stalpers et al.,
Pediatric Res., 36:227 (1994)) and reverse transcriptase
PCR (Nakamura et al., J. Neurological Sci. 122:74 (1994)).
Additional hybridization techniques include the use of
probes of varying CAG repeat lengths labeled with the same
or different radioactive or fluorescent dyes, for example.
This method allows for the direct detection of CAG repeats
(see, e.g., Sanpei et al., Biochem. Biophys. Res. Comm.
212:341-6 (1995); Taneja, J. Cell Biology, 128:995-1002
androgen receptor protein directly indicates the number of
CAG repeats. In yet another embodiment, an antibody which
binds a polyglutamine residue selectively by length can be
made and used to screen a protein fraction which contains
the androgen receptor.
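Where the repeat length is read from a DNA or protein sequence, the counting step can be sketched as below. This is a minimal illustration in Python, assuming the sequence is already available as a plain string; the function names are hypothetical, and only the longest uninterrupted run is counted, which is a simplifying assumption rather than a rule stated in this description.

```python
import re


def longest_cag_run(dna_sequence):
    """Number of CAG units in the longest uninterrupted CAG run."""
    seq = dna_sequence.upper().replace(" ", "")
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)


def longest_glutamine_run(protein_sequence):
    """Number of residues in the longest uninterrupted glutamine (Q) run."""
    runs = re.findall(r"Q+", protein_sequence.upper())
    return max((len(run) for run in runs), default=0)


# Made-up sequences for illustration only: 21 CAG units with short flanks.
example_dna = "gcg" + "cag" * 21 + "caa"
example_protein = "MEVQLGL" + "Q" * 21 + "ETSP"
```

Because each CAG codon encodes one glutamine, the two counts agree when the DNA and protein sequences describe the same uninterrupted repeat.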

The number of CAG repeats in the AR gene or the number
of TA repeats in the 5-alpha reductase gene can be
determined by methods known in the art. The source of DNA,
cDNA or RNA can be from patient biological samples, such as
blood, biopsy tissue, sperm, fibroblasts or other somatic
or germline cells.

One such method is PCR using a pair of primers
specific for sequence flanking the CAG repeat region of
exon 1 or the TA repeat region in the 5 alpha reductase
gene. The resulting products can be sequenced, analyzed
for size on gels, such as polyacrylamide or agarose gels,
or evaluated by physical characteristics such as melting
temperature or secondary structure. Other methods for
determining size of nucleic acid fragments can be employed.

Co-amplification of two alleles in a heterozygote can
generate PCR products which differ in the number of repeats
and therefore their melting and secondary structure
characteristics are likely to differ. Under conditions as
described in, e.g., Mutter and Boynton (Nucleic Acids Res.
23:1411 (1995)), amplification efficiency of the two alleles
is near-equivalent, generating PCR products in a ratio
proportional to that of the genomic template. Variability
and biasing can be diminished by substitution of 7-deaza-
2'-dGTP for dGTP during amplification, an intervention
which reduces stability of intramolecular and
intermolecular GC basepairing.

Allelic PCR fragments are easily separated, for
example, by gel electrophoresis and detected by
intercalating dye staining (e.g., ethidium bromide). As an
alternative procedure, capillary electrophoresis can be
The term "prognosis" is defined herein as the
judgement in advance concerning the probable course of a
disease and/or the chances of recovery.

The invention can be utilized particularly
advantageously in combination with the information made
available in other screening assays and risk factor
assessment methods and criteria.

The present invention will now be illustrated by the
following examples, which are not intended to be limiting
in any way.

EXEMPLIFICATION

EXAMPLE 1 
Methods 

Study population 

The Physicians' Health Study is an ongoing randomized
double-blind, placebo-controlled trial of beta-carotene
among 22,071 U.S. male physicians, aged 40 to 84 years in
1982. The cohort is predominantly white (over 95%). Men
were excluded if they reported a prior history of
myocardial infarction, stroke, transient ischemic attacks,
unstable angina, cancer (except for non-melanoma skin
cancer), current renal or liver disease, peptic ulcer or
gout, contraindication to use of aspirin, or current use of
aspirin, other platelet-active agents or vitamin A
supplements. The trial had included an aspirin component
that was terminated in January, 1988 due primarily to a 44%
reduction in the risk of a first myocardial infarction
among those in the aspirin group (Steering Committee of the
Physicians' Health Study Res. Group, N.E. J. Med., 321:129-
35 (1989)).

Study participants completed two mailed questionnaires
before randomization in 1982, and additional questionnaires
at six months, 12 months, and annually thereafter. Before
randomization blood kits were sent to all participants with
(1995); and Saito, Japanese Journal of Human Genetics,
39:421-5 (1994)).

In yet another embodiment the protein which is encoded
by a repeat-containing fragment of the gene, or in the
alternative, the nucleic acid, can be separated by size
using art-recognized separation media and methods.
Standard polyacrylamide gels or a modified SDS-PAGE
protocol using low concentration of methylenebisacrylamide
and long runs can be used (Ide et al., Biochem. Biophys.
Res. Comm., 209:1119 (1995)).

Alternatively, reverse blot techniques can be employed
for determining a small number of repeats or differences in
repeats as described by Wehnert et al. (Nucleic Acids Res.
22:1701-4 (1994)). In this method, oligonucleotides
representing trinucleotide (21mers) tandem repeats are
directly synthesized and arrayed onto an aminated substrate
(e.g., polypropylene). DNA samples of different
complexities can be used and are radiolabeled and
hybridized to the oligonucleotide array. The reverse blot
system specifically identifies trinucleotide short tandem
repeats (STRs). There is low to no random or
cross-hybridization to nonspecific sequences and it is
possible to detect as few as three repeated units in a
particular location. Varying the hybridization stringency
can enhance the detection of STRs. This single step
reverse blot system therefore allows the rapid, specific
and sensitive identification of various STRs in DNA sources
of different complexity.

In yet another embodiment, CAG binding proteins, TRIP-
1 and TRIP-2, as described by Yano-Yanagisawa et al.
(Nucleic Acids Res. 23:2654-60 (1995)) can be used to
isolate CAG-containing DNA. These proteins may also
require a minimum of eight (AGC) trinucleotide repeating
units for recognition and binding.
diagnosis, tumor grade, Gleason score, type of presentation
(e.g., symptoms, screening rectal exam, etc.), and
treatment modalities. Stage was recorded according to the
modified Whitmore-Jewett classification scheme (Beahrs
O.H., et al., Manual for staging of cancer. 4th ed.
Philadelphia: J.B. Lippincott (1992)). If multiple tissue
samples were examined, the highest reported grade and
Gleason score were recorded. Cases without pathological
staging were classified as indeterminate stage unless there
was clinical evidence of distant metastases. "Aggressive"
cases were defined as those diagnosed at stage C or D
(extraprostatic) plus those diagnosed at stage A or B or
indeterminate with either poor histologic differentiation
or Gleason score 7 or greater. Cases with clinical stage A
or B or no pathological staging, and moderate or better
histologic grade were classified as non-aggressive. Among
patients with localized prostate cancers, those with poor
histological features have increased mortality, and thus
warrant categorization as aggressive (Gleason, D.F., et
al., J. Urology, 111:58-64 (1974)). In this cohort, 69% of
the fatal cases occurred in men (27.8% of total) designated
with both advanced stage (at diagnosis) and histologically
aggressive tumors. By 1992, 27.5% of men with tumors both
high grade and stage had died of prostate cancer, whereas
only 4.3% of all others had died by the end of follow-up.
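The case definition above can be written out as a small decision rule. The following Python sketch is illustrative only: the function name, the argument names, and the use of None for an indeterminate stage are assumptions introduced here, while the criteria themselves (stage C or D, poor differentiation, Gleason score 7 or greater) are taken from the text.

```python
def classify_case(stage, gleason=None, poorly_differentiated=False):
    """Classify a case as 'aggressive' or 'non-aggressive'.

    stage: 'A', 'B', 'C' or 'D', or None when staging is indeterminate.
    gleason: Gleason score, or None when it was not reported.
    poorly_differentiated: True for poor histologic differentiation.
    """
    if stage in ("C", "D"):
        # Extraprostatic disease is aggressive regardless of grade.
        return "aggressive"
    high_grade = poorly_differentiated or (gleason is not None and gleason >= 7)
    return "aggressive" if high_grade else "non-aggressive"
```

For example, a stage A tumor with Gleason score 7 is classified as aggressive, while an unstaged tumor with Gleason score 6 and moderate differentiation is classified as non-aggressive.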

Analysis for CAG repeat length in the androgen receptor 

Since the AR gene is X-linked, only one copy of the
gene exists in men. The CAG microsatellite region resides
in the coding region of the gene within the first exon. A
system to rapidly analyze the CAG repeat sequence length in
a large number of samples was established. Five hundred
microliters of whole blood was thawed from cases and
controls and DNA was extracted utilizing the Qiagen QIAamp
Blood Kit. A set of oligonucleotide primers that span the
instructions to have their blood drawn into Vacutainer
tubes containing EDTA, to centrifuge them, and to return
the plasma in polypropylene cryopreservation vials. The
kit included a cold pack to keep the specimens cool until
receipt the following morning, when they were aliquoted and
stored at -82°C. During storage, precautions were taken so
that no specimen thawed or warmed substantially. Specimens
were received from 14,915 (68%) of the randomized
physicians; over 70% between September and November, 1982.

Selection of Cases and Controls

When a participant reported a diagnosis of cancer on
the follow-up questionnaires, medical records, including
pathology reports, were requested and reviewed by study
physicians from the End Points Committee. By March,
1992, 520 cases of prostate cancer, of which 368 had
provided blood, were confirmed. The lack of blood samples
for some study participants is unlikely to have introduced
selection bias, since it is unlikely that physicians who
did or did not provide a sample would differ in terms of
the relationship of the AR CAG microsatellite to subsequent
prostate cancer experience. For each case, one control who
had provided plasma, had not had a previous prostatectomy,
and had not reported a diagnosis of prostate cancer at the
time diagnosis was reported by the case, was selected.
Controls were also matched on smoking status and age within
one year, except for several very elderly cases for whom
age had to be matched within two years. After 10 years of
follow-up, over 99% of the men were still reporting
morbidity events, and vital status was ascertained for
100%.



Medical Record Review 

A study physician, unaware of assay results, reviewed 
medical records for each case to determine stage at 
fatal cases) could be utilized. Analyses limited to cases
to examine various parameters of aggressive behavior
(stage, grade, fatality) in relation to CAG repeat length
were also conducted.
CAG repeat lengths were analyzed as a continuous variable in
logistic models, which maximizes efficiency
under the assumption that a one unit increment in CAG
repeat length is related to a constant increase or decrease
in the natural logarithm of the odds ratio. The p-value
for the continuous variable formed the basis of the test
for trend. Men were categorized into groups to observe if
nonmonotonic increases existed across levels of CAG
repeats (e.g., if a threshold existed). The categorization
(ranging from < 19 to > 26 repeats) was based on
approximating a relatively equal distribution of the
values, although the numbers in the categories fluctuated
somewhat because of the very uneven distribution. All
decisions for categorization were made before the
analyses were conducted. Potential confounding by alcohol
consumption, multivitamin use, body mass index and exercise
level on the 1982 questionnaire, and aspirin use based on
randomization, was addressed by including these as
covariates in multivariate models. All reported p-values
are based on two-sided tests.
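Under the log-linear assumption stated above, an odds ratio for any increment in repeat number follows from the per-repeat coefficient by exponentiation. The Python sketch below illustrates that arithmetic only; the function names are hypothetical, and the value 0.66 is the six-repeat relative risk for "aggressive" disease quoted in the Summary, used here as an input rather than recomputed from data.

```python
import math


def odds_ratio_for_increment(beta_per_repeat, n_repeats):
    """Odds ratio for an n-repeat increment: exp(n * beta)."""
    return math.exp(beta_per_repeat * n_repeats)


def beta_from_odds_ratio(odds_ratio, n_repeats):
    """Per-repeat log odds ratio implied by an n-repeat odds ratio."""
    return math.log(odds_ratio) / n_repeats


# Per-repeat coefficient implied by an odds ratio of 0.66 per six repeats.
beta = beta_from_odds_ratio(0.66, 6)
per_repeat_or = odds_ratio_for_increment(beta, 1)
```

An odds ratio below 1 for a six-repeat increment therefore corresponds to a negative per-repeat coefficient, that is, lower odds with each additional repeat.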

Results

The number of CAG repeats in the AR ranged from 14 to
32 among cases and from 9 to 3? among controls. The mean
(and standard deviation) for CAG repeats was 21.37 (3.07)
for cases and 21.95 (3.46) for controls. The difference in
means was not statistically significant. As had been seen
in other Caucasian populations, a bimodal distribution was
observed with a primary peak at 21 CAG repeats and a
secondary peak at 24 to 25 CAG repeats (Edwards A., et al.,
Genomics 12:241-53 (1992)).
CAG repeat (5'TCCAGAATCTGTTCCAGAGCGTGC3' (SEQ ID NO: 1) and
5'GCTGTGAAGGTTGCTGTTCCTCAT3' (SEQ ID NO: 2)) were
constructed. The DNA was amplified using these primers by
polymerase chain reaction (PCR) to produce fragments of the
N-terminal domain of the AR. The length of these fragments
varied only by the number of CAG repeats. For rapid and
accurate assessment of fragment length, the DNA fragments
were run on a 5% denaturing polyacrylamide gel by automated
fluorescence detection (Genescan, Applied Biosystems).
Using a series of sequenced PCR products of varying size,
DNA markers were used to create a standard curve of peak
arrival time that in turn was used to calculate the length
of an unknown PCR product automatically. Resolution of 1
base pair using this system was confirmed with direct DNA
sequencing. The assays were conducted by laboratory
personnel blinded to case-control status. Split samples
were used to ensure quality control. It was possible to
amplify 367 of the 368 cases.
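The sizing step can be sketched in Python as an interpolation along a standard curve followed by a conversion from fragment length to repeat number. Everything specific in this sketch is an assumption for illustration: the marker times and sizes are made up, piecewise-linear interpolation stands in for whatever curve the instrument software fits, and flank_bp (the number of non-repeat bases in the amplified fragment) is a hypothetical parameter whose value is not given in this description.

```python
from bisect import bisect_left


def size_from_arrival(arrival_time, markers):
    """Interpolate fragment size (bp) from peak arrival time.

    markers: (arrival_time, size_bp) pairs for sequenced standards.
    """
    points = sorted(markers)
    times = [t for t, _ in points]
    if not times[0] <= arrival_time <= times[-1]:
        raise ValueError("arrival time outside the calibrated range")
    i = bisect_left(times, arrival_time)
    if times[i] == arrival_time:
        return float(points[i][1])
    (t0, s0), (t1, s1) = points[i - 1], points[i]
    return s0 + (s1 - s0) * (arrival_time - t0) / (t1 - t0)


def repeats_from_length(fragment_bp, flank_bp):
    """Convert fragment length to CAG repeat number (3 bp per repeat)."""
    repeat_bp = round(fragment_bp) - flank_bp
    if repeat_bp < 0 or repeat_bp % 3:
        raise ValueError("length is not flank plus a whole number of repeats")
    return repeat_bp // 3


# Made-up calibration standards and an unknown peak, for illustration only.
standards = [(100.0, 250), (110.0, 280), (120.0, 310)]
unknown_bp = size_from_arrival(105.0, standards)
unknown_repeats = repeats_from_length(unknown_bp, flank_bp=202)
```

Rejecting lengths that are not a whole number of repeats beyond the flank is one simple way to flag sizing errors, in keeping with the 1 base pair resolution described above.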

Data Analysis 

Analyses to determine whether AR CAG repeat length was
related to total prostate cancer and, secondly, to
malignancies of increased aggressiveness were conducted.
Aggressive behavior was determined by combinations of 3
sets of data: histology (tumor grade or Gleason score),
tumor stage, and fatality. The relative risk (estimated by
the odds ratio) of developing total, aggressive, non-
aggressive, high-grade, advanced-stage, and fatal prostate
cancer was examined. Unconditional logistic regression,
controlling for age and smoking, the matching variables, was
used to compute odds ratios and 95 percent confidence
intervals, after first conducting conditional logistic
regression to confirm similar results. By using unconditional
logistic regression, information from all controls in the
analyses limited to a subset of cases (e.g. aggressive or
Next the relative risks (odds ratios, OR) for total
and sub-groups of prostate cancers by CAG repeat length
were examined. For total prostate cancer, a slight inverse
association between CAG repeat size and risk of disease was
observed, but this was not statistically significant.
However, statistically significant inverse associations for
cancers characterized by various indicators of aggressive
or advanced disease were noted, whereas no association was
noted for non-aggressive cases. Only 36 of the cases of
prostate cancers were fatal in this time period, but a
strong inverse association between CAG repeat length and
fatal prostate cancer was observed, although this just
missed attaining conventional statistical significance. AR
CAG length was not correlated with any cofactor considered
(age, year of diagnosis, alcohol intake, physical activity,
multivitamin use, body mass index, and aspirin use); hence,
the results were unaltered when these were included as
covariates in models. Also, Table 1 reports odds ratios for
a six-CAG increment (equivalent to the difference between
the median CAG repeat between the high and low tertiles).
Also shown are results for high grade and advanced stage
lesions separately.

The mean CAG repeat length among the different classes
of tumors was examined. Men with non-aggressive tumors had
a slightly higher CAG repeat length than controls, but this
was not statistically significant. Aggressive cases,
defined by both grade and stage, had lower mean CAG repeat
lengths. These differences were statistically significant
for advanced cases (P = 0.02) and high grade cases
(P = 0.03), or either (P = 0.01), and for distant metastatic
or fatal cases (P = 0.008). The difference in fatal cases was
less striking (P = 0.06). A test for trend in CAG repeat
length across 3 levels of disease (non-aggressive disease,
aggressive but regional disease, and distant disease) using
progressive ordinal values was highly significant
(P = 0.005) in a linear regression model (Table 3).



TABLE 3

Group                           n     Mean (SD)             P-Value

Controls                        367   21.?5 (3.46)
Stage A or B and Gleason ≤ 6    195   [illegible] (3.1?)
Stage C or D or Gleason ≥ 7     182   21.47                 0.01
Stage C or D                    139   21.47 (2.99)          0.02
Gleason ≥ 7                     134   21.49 (3.??)          0.03
Fatal Cases                     36    [illegible] (2.94)    0.06



Discussion 

A low number of CAG repeats in exon 1 of the AR gene
was closely related to aggressive behavior in prostate
cancer, as defined by various measures including
histological grade, stage at diagnosis, and mortality.

The relationship between CAG repeat length and risk of
prostate cancer by age group was examined. No appreciable
association was found in men younger than about 60 years, but
progressively stronger inverse associations were found for
men 60 to 69 years and men 70 years or older (Table 2).
A statistically significant multiplicative interaction
(P = 0.015) existed between age of disease and CAG length
for total prostate cancer, as well as for most of the
sub-groups of cases. In essence, among the men over the age
of 60 years, the CAG repeat length was an important
predictor of risk, whereas among those under 60 years, CAG
repeat length was weakly related or unrelated to risk. Risk
of advanced, aggressive, or fatal disease was particularly
strongly related to CAG length among the older men.



TABLE 2

                                       Total               Men < 65 yrs old    Men ≥ 65 yrs old

Total Prostate Cancer Cases (n)        367                 199                 168
  OR (95% CI)                          0.75 (0.4?-1.17)    1.05 (0.56-1.95)    0.52 (0.27-1.01)

Non-Aggressive Prostate Cancer
Cases (n)                              185                 90                  95
  OR (95% CI)                          1.01 (0.56-1.74)    1.40 (0.63-3.11)    0.75 (0.35-1.61)

Aggressive Prostate Cancer Cases (n)   182                 109                 73
  OR (95% CI)                          0.54 (0.31-0.9?)    0.62 (0.39-1.73)    0.30 (0.12-0.73)

High Grade Prostate Cancer Cases (n)   134                 83                  51
  OR (95% CI)                          0.54 (0.29-1.02)    0.85 (0.37-1.95)    0.29 (0.10-0.79)

Advanced Stage Prostate Cancer
Cases (n)                              140                 84                  56
  OR (95% CI)                          0.52 (0.28-0.98)    0.52 (0.40-2.08)    0.25 (0.05-0.69)

Fatal Prostate Cancer Cases (n)        36                  17                  19
  OR (95% CI)                          0.33 (0.11-1.04)    1.18 (0.23-6.10)    0.09 (0.01-0.54)

androgens is ultimately mediated through the androgen
receptor. In transfection experiments, polyglutamine tract
length in the AR is associated with lower transactivation.
This inverse association is length-dependent, and occurs
even within the normal range of CAG repeats (Chamberlain,
N.L., et al., Nucleic Acids Res., 22:3181-6 (1994)). DNA
and androgen binding, which occur in different regions of
the AR, appear to be unaffected by CAG length. Abnormally
high CAG repeat length (>40), which causes spinobulbar
muscular atrophy or Kennedy syndrome, is associated with
clinical androgen insensitivity in men (LaSpada, A.R., et
al., Nature, 352:77-9 (1991); Igarashi, S., et al.,
Neurology, 42:2300-2 (1992)). Limited but inconclusive
evidence suggests that polyglutamine length of the AR
within the normal range (e.g. 12-27) correlates with
androgenic characteristics (Legro, R.S., et al., Obstet.
Gynecol., 83:101-106 (1994)). Given clear evidence of
clinical androgen insensitivity with long CAG repeat
lengths and the linear gradient between CAG repeat length
and AR transactivation in vitro, it is reasonable to assume
that variation within the normal range is associated with
physiologic effects, albeit subtle, in cells.

The results suggest that polymorphisms in the CAG
repeat lengths of the AR, which are correlated with AR
transactivity, influence the promotion or progression of
prostatic tumors. Of note, a somatic mutation which
resulted in a contraction of the CAG microsatellite
(CAG 24→18) was observed in an adenocarcinoma of the prostate
(Schoenberg, M.P., et al., Biochem. Biophys. Res. Comm.,
195:74-90 (1994)), although whether this contraction was
directly involved in the development or progression of the
tumor is unknown.

While the inverse association with aggressive
cancers was statistically significant in the entire
population, the magnitude of the association escalated

Results were consistent whether men with aggressive lesions
were compared to prostate cancer-free controls or to men
with non-aggressive prostate cancer, and CAG repeat
length tended to decrease as the indicator of
aggressiveness progressed, as from regionally aggressive to
distant disease. The reliance on pathology reports to
document Gleason score and tumor grade probably results in
some degree of measurement error, but this imprecision
would tend to attenuate any true associations.
Nonetheless, these histological parameters were strong
predictors of mortality from the disease, which supports
the quality of the reports.

Prostatic cancer appears to depend on the presence of
androgens (Coffey D.S., UICC Technical Report Series,
48:4-23, Geneva: International Union Against Cancer, (1979)).
Early prostate cancer is sensitive to androgens and often
regresses when androgen stimulation is withdrawn (Coffey
D.S., UICC Technical Report Series, 48:4-23, Geneva:
International Union Against Cancer, (1979)). Prostate
cancer occurs rarely in castrated men (Hovenian, M.S. and
Deming, C.L., Surg. Gynecol. Obstet., 86:29-35 (1948)), and
the prolonged administration of high levels of testosterone
has induced prostate cancer in rats (Noble, R.L., Cancer
Res., 37:1929-1933 (1977); Pollard, M., et al., Prostate,
3:563-568 (1982)). Patients with cirrhosis, characterized
by high estrogen and low testosterone levels, appear to
also be at lower risk of prostate cancer (Robsin, M.C.,
Geriatrics, 21:150-154 (1966)). However, whether hormone
levels within normal ranges are important determinants of
prostate cancer remains unsubstantiated (Zaridze, D.G. and
Boyle, P., British J. of Urology, 59:493-502 (1987);
Nomura, A., et al., Cancer Res., 48:3515-3517 (1988);
Hsing, A.W. and Comstock, G.W., Cancer Epidemiology
Biomarkers & Prevention, 2:27-32 (1993); Barrett-Connor,
E., et al., Cancer Res., 50:169-173 (1990)). The action of

polymorphism in the general U.S. population may be even
greater than our overall findings would suggest.

Even when African-American or black men have equal
access to health care as whites in the U.S., the black men
have about a two-fold higher rate of metastatic prostate
disease and mortality, larger tumor volumes, and higher PSA
values (Brawn, P.N., et al., Cancer, 71:2569-2573 (1993);
Moul, J.W., et al., JAMA, 274:1277-1281 (1995)). Although
equal access to care does not assure equal utilization,
these data are strongly indicative of a more aggressive
tumor biology among blacks. Based on the different
distributions of AR CAG repeats among black and white men
in the U.S. (Edwards A., et al. Genomics 12:241-53 (1992)),
and on our estimated relative risk of fatal prostate cancer
related to CAG repeat length, we calculated a 54% greater
risk of dying from prostate cancer among blacks 60 years
and older than whites. U.S. black men also have higher
levels of bioavailable testosterone than whites (Ross,
R.K., et al., J. Natl. Cancer Inst., 76:45-46 (1986)) and
appear exposed to higher levels of estrogen and
testosterone in utero (Henderson, B.E., et al., Br. J.
Cancer, 57:216-218 (1988)). Although the relationship
between CAG repeat length and prostate cancer risk should
be confirmed directly in black men, our study design
(largely restricted to a single racial group) provides
strong evidence of causality. In fact, an association
between CAG length and prostate cancer risk observed in a
racially heterogeneous population is likely to be confounded
by any factor (genetic or environmental) that varies across
the racial groups.

This polymorphism in the androgen receptor is
interesting in another respect. Most known germline
mutations that confer higher risk of cancer (e.g. BRCA1 in
breast, mismatch repair genes in colon, putative prostate
cancer suppressor gene) are characterized by early age of

sharply with increasing age, a surprising result. At least
two factors may account for the apparently stronger
relationship among older men. Among younger men, a
substantial proportion of prostate cancers is probably
related to a rare, autosomal dominant, highly penetrant
allele (Carter, B.S., et al., Proc. Natl. Acad. Sci. USA,
59:3357-3371 (1992)). Of the cumulative total of prostate
cancers occurring by ages 55, 70, and 85 years, this allele
appeared to be responsible for 43%, 34%, and 9%,
respectively, of the total cases occurring by these ages.
Given that almost 40% of the cancers among men younger than
the age of 60 years is determined largely by the highly
penetrant germline mutation, the relative contribution of
the AR polymorphism may be substantially attenuated.

Another potentially relevant factor may be the
hormonal changes related to aging, particularly the major
reduction in free testosterone and an increase in estrogen
levels (Sandberg, A.A., The Prostate, 2:169-184 (1980)).
The overall reduction in androgenicity related to aging
parallels the decreasing proportion of advanced stage and
high grade tumors (32.2% among men under 60 years, 27.5%
among men 60 to 69 years, and 21.5% among men 70 years or
older in our data). Possibly, AR CAG repeat length is
a more important determinant of transactivation in a low
androgen environment. A substantially larger study
population than the current one would be required for
sufficient power to examine the interaction between AR CAG
repeat length and hormone levels.

It is worth noting that, due to the low numbers of
older men in the Physician's Health Study, 33% of the
cancers were diagnosed in men younger than age 60 years.
In contrast, almost 90% of prostate malignancies occur
among men over the age of 60 years in the general U.S.
population. Thus, the numerical impact of this



WO 97/17469

PCT/US96/17785

sample was diluted to a final concentration of 20 ng/µl and
stored at -20°C until needed. No storage time exceeded 6
months.

Twenty to 40 nanograms of sample DNA was added to the
PCR reaction mixture which included primers (previously
described by Davis and Russell (6))
5'-GCTGATGAAAACTGTCAAGCTGCTGA-3' (SEQ ID NO: 3) and
5'-GCCAGCTGGCAGAACGCCAGGAGAC-3' (SEQ ID NO: 4) at a
concentration of 1.0 µM each along with 50 mM KCl, 1.5 mM
MgCl2, 125 µM each dNTP, and 1.5 units of AmpliTaq (Perkin
Elmer) in a final volume of 22 microliters. All
amplifications were performed using MicroAmp tubes (Perkin
Elmer).

A Perkin Elmer GeneAmp PCR System 9600 thermocycler was
programmed for two step PCR. After 2 minutes at 94°C,
samples were initially cycled 31 times with a melting step
at 99°C for 15 seconds and an annealing and elongation step
at 68°C for 35 seconds. There was a final elongation step
for 5 minutes at 68°C. These parameters result in
exuberant amplification of the TA(0) and TA(9) alleles.
However, after the initial round of amplification, no TA(18)
alleles were clearly identified. One sample which did not
amplify was subjected to different cycling parameters and
eventually proved to be from a patient homozygous for TA(18).
Using this patient's DNA mixed with DNA from a patient
homozygous for TA(0), the cycling parameters were optimized
until a clear band for the TA(18) allele was reliably
detected. All samples were then repeated with new
parameters: 94°C for 2 minutes followed by 30 cycles of
94°C for 30 seconds then 64°C for 1 minute, 30 seconds, and
a final elongation for 9 minutes at 65°C. Each set of 33
samples was run in parallel with a positive control (TA(18)
DNA mixed with TA(0) DNA in a 1:1 ratio) and a negative
control (H2O). Samples that had an ambiguous result or any
set of 23 with a poor positive control were repeated.

disease onset, high population attributable risk at young
ages, but a relatively low attributable risk due to the
sharply increasing incidence of "sporadic" cancers that
occurs with advancing age. In contrast, the pattern
characterized by the AR CAG polymorphism is of a moderate
gradient of risk across CAG lengths. Because this
polymorphism influences the progression of "sporadic"
cancers, the population attributable risk is quite high.
For example, it is estimated that 55% of distant metastatic
prostate cancer among men over 60 years is attributable to
CAG lengths less than 24, the cut-off between the upper and
middle tertile. Thus, this polymorphism may play a role in
the majority of deaths due to prostate cancer.

The results provide strong evidence that the
variability in the transactivity of the AR determines the
risk of developing "aggressive" prostate cancer. These
data may represent the first known germline polymorphism
related to tumor promotion or progression in "sporadic"
tumors. Moreover, these findings may help explain the
higher rate of prostate cancer mortality among black men,
and the tendency for blacks to be diagnosed with more
extensive disease.

TA POLYMORPHISM IN PROSTATE CANCER
Methods

The participants in the Physician's Health Study, as
described above, were used in this example as well.

Whole blood samples from cases and matched controls
were received from the Physicians Health Study coded, with
the laboratory investigators blinded to the name and status
of each sample. Genomic DNA was obtained from 500 µl of
the thawed whole blood using a commercially available kit
(QIAamp DNA extraction kit, QIAGEN, Chatsworth, CA, USA).
DNA concentration and purity were determined by UV
absorbency on a Beckman DU640 spectrophotometer. Each

into one of the TA families described previously. The
genotype for each sample was recorded and statistical
analysis was performed.

Genotype frequency by case-control status, including
that for aggressive cancers separately, was determined.
Conditional logistic regression analyses using the SAS
statistical software (SAS Institute Inc., NC, USA) were used
to compute odds ratios and 95% confidence intervals. To
examine aggressive cases, unconditional logistic regression
controlling for age and smoking, the matching variables, was
used. This allowed utilization of information from
controls matched to non-aggressive cases when analyzing the
aggressive cases. Potential confounding by alcohol
consumption, multivitamin use, body mass index and exercise
level on the 1982 questionnaire, and aspirin use based on
randomization, was addressed by including these as
covariates in multivariate models. All p-values are two
sided.

Results

The allele frequency among controls was 0.844
(n = 621) for TA(0), 0.152 (n = 112) for TA(9), and 0.004
(n = 3) for TA(18). The table below indicates the
frequencies of the 5 genotypes that we observed in this
population by case-control status. No appreciable
difference in case-control status for the prevalence of men
heterozygous for the TA(9) allele was found, but an excess
of controls was observed for men homozygous for TA(9) or
TA(18).

Sensitivity experiments using DNA from known
homozygotes for TA(0), TA(9), and TA(18) demonstrated the
assay's ability to detect a 1:5 ratio of the different
alleles. For example, clear signals of lengths consistent
with TA(0) and TA(18) were visible when 3.3 nanograms of
TA(18) DNA were mixed with 16.7 ng of TA(0) DNA and amplified
with the second cycling parameters listed above.

The PCR reaction clearly favored the shorter alleles,
however, and the longer bands in heterozygotes were
frequently fainter than the shorter bands. The initial
cycling parameters favored the shorter TA(0) allele to such
a degree that no TA(18) were identified. The final cycling
parameters reliably amplified the positive control. If
there was any ambiguity, samples were repeated. All
samples, with the exception of the 4 samples with TA(18),
were typed consistently in both rounds of amplification.
DNA sequencing of a representative homozygote for each of
TA(0), TA(9) and TA(18) confirmed that the
bands identified correlated with the expected genotype.
Similarly, the heterozygotes which were sequenced also had
the expected allele sequence.

After amplifications, 15 µl of amplified product was
separated using a 2% agarose gel and compared with HindIII
digested PhiX DNA (New England Biolabs, Massachusetts, USA)
after ethidium bromide staining. The TA allele families
can visually be discerned as either TA(0), TA(9) or TA(18).
A representative homozygote for each TA allele family was
purified using QIAquick Spin PCR purification columns
(QIAGEN, Germany) and the DNA sequence determined at the
Dana Farber Core Facility. TA alleles from representative
heterozygotes with the TA(9) and TA(18) allele family were
isolated using a MERmaid kit (Bio 101, California, USA) and
the DNA sequence was determined with the same methodology.
Identification of 2-4 base pair differences is not possible
with these separation methods and each allele was lumped

TABLE 5

Group             Controls            Cases               Aggressive cases
                  (n = 368)           (n = 368)           (n = 182)

TA(0)/TA(0)       0.72 (n = 265)      0.745 (n = 274)     0.759 (n = 138)
TA(0)/TA(9)       0.245 (n = 90)      0.23? (n = 88)      0.231 (n = 42)
TA(0)/TA(18)      0.0027 (n = 1)      0.0054 (n = 2)      0.0054 (n = 1)
TA(9)/TA(9)       0.03 (n = 11)       0.011 (n = 4)       0.0054 (n = 1)
TA(18)/TA(18)     0.0027 (n = 1)      0.00 (n = 0)        0.00 (n = 0)
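
As a consistency check, the allele counts reported in the Results can be tallied from the control genotype counts in the table above. The following is a minimal Python sketch and not part of the original analysis; the dictionary and function names are illustrative, and the first genotype count is read as 265, the value consistent with a frequency of 0.72 among 368 controls.

```python
from collections import Counter

# Control genotype counts as read from the genotype frequency table.
# 265 is an assumed reading of a garbled entry (0.72 x 368 controls).
CONTROL_GENOTYPES = {
    ("TA(0)", "TA(0)"): 265,
    ("TA(0)", "TA(9)"): 90,
    ("TA(0)", "TA(18)"): 1,
    ("TA(9)", "TA(9)"): 11,
    ("TA(18)", "TA(18)"): 1,
}


def allele_counts(genotype_counts):
    """Count alleles: each man contributes both alleles of his genotype."""
    counts = Counter()
    for (first, second), n_men in genotype_counts.items():
        counts[first] += n_men
        counts[second] += n_men
    return counts


counts = allele_counts(CONTROL_GENOTYPES)
total_alleles = sum(counts.values())
frequencies = {allele: round(n / total_alleles, 3) for allele, n in counts.items()}
```

Each man carries two alleles, so 368 controls contribute 736 alleles, and the tallies agree with the reported allele counts of 621, 112, and 3.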



Among controls for whom we had hormone levels, we
examined levels of testosterone (T), sex hormone binding
globulin (SHBG), dihydrotestosterone (DHT), estradiol (E2),
and 3-alpha androstanediol glucuronide, which is an index of
5 alpha reductase activity (4). No appreciable difference
in means among men based on their SRD5A2 genotype (see
Table 6) was observed.

TABLE 6

Hormone           Genotype
Levels            TA(0)/TA(0)             Heterozygote

T                 4.71 (4.43-4.99)        5.11 (4.53-5.6?)
DHT               0.392 (0.35-0.43)       0.44 (0.3?-0.5?)
T/DHT ratio       0.089 (0.080-0.099)     0.090 (0.076-0.104)
3α ADG            6.62 (6.12-7.12)        6.76 (5.83-7.69)
SHBG              23.6 (20.3-25.2)        28.5 (23.2-33.8)*



Discussion 

This study provides the first case-control study that
directly examines the association between the TA
dinucleotide repeat in the 3' untranslated region of
5 alpha reductase and risk of CaP. The results are contrary
to earlier presumptions that longer TA alleles may lead to

TABLE 4

                              TA Allele Genotype

Group                         TA(0)/TA(0)        TA(0)/TA(9) or       TA(9)/TA(9) or
                                                 TA(0)/TA(18)         TA(18)/TA(18)

Total Prostate Cancer         1.0 (reference)    0.94 (0.65-1.30)     0.32* (0.10-1.02)
Aggressive Prostate Cancer    1.0 (reference)    0.91 (0.60-1.38)     0.16 (0.02-1.26)

*p = 0.05



Next, the relative risks for total and for aggressive
prostate cancers according to genotype frequency were
examined. Because of the rarity of the TA(18) allele in
this population, men with TA(18) and the men with TA(9) were
combined. This a priori decision was based on the assumption
that any functional effect of either TA(9) or TA(18) would
likely be in a similar direction. It was found that men
having the TA(0)/TA(9) or TA(0)/TA(18) genotype were not at
appreciably lower or higher risk of total prostate cancer.
However, homozygotes (TA(9)/TA(9) or TA(18)/TA(18)) were at
appreciably lower risk (OR = 0.32, CI 0.10 - 1.02).
Although only 16 such men existed, this inverse association
achieved conventional statistical significance (p = 0.05,
two sided). Also of note, the upper bound of the confidence
interval, 1.02, provides strong evidence against a higher
risk of prostate cancer among homozygotes. When analysis
was limited to aggressive prostate cancer, the inverse
association with homozygotes became even stronger and a
weak non-significant inverse association among the
heterozygotes became evident (see Table 5).
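
The homozygote association can be approximated directly from the genotype counts for cases and controls (4 homozygous cases and 12 homozygous controls, against 274 cases and 265 controls with TA(0)/TA(0)). The Python sketch below computes a crude odds ratio with a Woolf (log-based) 95% confidence interval. It is an illustration only: the function name is hypothetical, the control count of 265 is the reading consistent with the reported frequency of 0.72, and the crude estimate ignores the matching and covariate adjustment used in the logistic regression analyses, so it should only roughly agree with the reported estimate.

```python
import math


def crude_odds_ratio(case_exposed, control_exposed, case_ref, control_ref, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    'Exposed' here means homozygous for a longer TA allele; the reference
    group is TA(0)/TA(0).
    """
    odds_ratio = (case_exposed * control_ref) / (control_exposed * case_ref)
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / control_exposed + 1 / case_ref + 1 / control_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts as read from the genotype table (265 is an assumed reading).
odds_ratio, ci_lower, ci_upper = crude_odds_ratio(4, 12, 274, 265)
```

With these counts the crude estimate is about 0.32 (0.10 - 1.01), close to the adjusted values given in the text.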

Employing the above described methods, morbidity and
mortality risks can be assessed in males who have not or
have been diagnosed with prostate cancer. Armed with these
additional criteria for assessing likelihood of aggressive
onset or mortality, a male identified as being of increased
risk can be screened for prostate cancer more frequently
and aggressively in order to identify disease onset at the
earliest stage possible. Upon onset of disease, the
aggressiveness of the treatment protocol can be defined
based, at least in part, on assessment of these new risk
factors.



EXAMPLE 2 

The CAG Repeat Within the Androgen Receptor Gene and its
Relationship to Prostate Cancer

The relationship between the polymorphic CAG repeat
length of the androgen receptor gene, which is inversely
correlated with transcriptional activation by the androgen
receptor, and prostate cancer was further examined. The
design was a nested case-control study within a prospective
cohort. The subjects were participants in the Physician's
Health Study. The main outcome measures were five hundred
and eighty-seven newly diagnosed cases of prostate cancer
detected between 1982 and 1995, and 588 controls.

Results

An inverse association between androgen receptor gene
CAG repeat length and risk of prostate cancer was observed.
For an increment of six CAG repeats, equivalent to the
difference between the median CAG length in the upper
versus lower tertile of CAG repeats, the relative risk of
prostate cancer was 0.78 (95 percent confidence interval,
0.62-0.99; p = 0.04). In particular, CAG repeat length was
inversely associated with cancers characterized by
extraprostatic extension or distant metastases (stage C or

an increased risk of CaP based on the indirect evidence
that they are more frequent in African-Americans.

It is clear that racial differences exist in the
distribution of the TA polymorphism lengths. Reichardt et
al., in the largest multiethnic population yet typed,
suggested that the TA(18) is exclusively in African-
American men (5). In the Physician's Health Study, which
is comprised of predominantly white men, there were 4 men
with the TA(18). Although it is now shown that the allele
is present among Caucasians, it is much less frequent than
in African-Americans (<1% (5/1472) compared to 18% (17/94),
respectively). The reason for this disparity and the
clinical significance remain unknown.

The study revealed that in a mostly Caucasian
population, being homozygous for the longer allele may in
fact be protective from CaP. This result was just within
conventional statistical significance with a p value of
0.05. Adding strength to the finding was the congruous
finding that men with longer TA repeats had a trend towards
less aggressive tumors as well. This analysis was
underpowered because of low numbers.

The biological significance of this TA allele is
unknown. Similar areas of TA-rich sequence in the 3'
untranslated regions of other genes have been associated
with messenger instability (Zubiaga, A.M., et al., Mol. and
Cellular Biol. 15(4):2219-2230 (1995)). One hypothesis is
that with increasing TA length there is more messenger
instability and lower resultant levels of 5 alpha reductase
activity. This effect will most likely be subtle and it
seems entirely consistent that very little effect is seen
with heterozygotes and only in the homozygote state does
the longer TA repeat protect against CaP. A lifetime of
lower activity of 5 alpha reductase and lower intra-
prostatic levels of DHT may provide the connection between
the TA allele and risk of CaP.

This inverse relationship is linear and includes the normal
range (Kazemi-Esfarjani P., et al., Human Molecular
Genetics, 4:523-7 (1995)). Expansion to greater than 40
repeats which, through an unknown mechanism, causes
X-linked spinal and bulbar muscular atrophy (Kennedy's
disease), leads to clinical androgen insensitivity despite
normal serum testosterone levels in men (LaSpada, A.R., et
al., Nature, 352:77-9 (1991); Arbizu, T., et al., J.
Neurol. Sci., 55:371-82 (1983); Igarashi, S., et al.,
Neurology, 42:2300-2 (1992)).

Several observations suggest indirectly that variation
in the AR polyglutamine length, by modulating androgen
activity, influences prostate carcinogenesis. African
Americans, who have generally shorter CAG repeat lengths in
the AR (Coetzee, G.A., Ross, R.K., J. Natl. Cancer Inst.,
86:872-3 (1994)), have a higher incidence and mortality
rate from prostate cancer. The AR is located on the
X-chromosome, and consistent with an X-linked genetic
component for prostate cancer is that history of the
disease in a brother carries greater risk than paternal
history (Woolf, C.M., Cancer, 13:739-44 (1960); Monroe,
K.R., et al., Nature Med., 1:827-9 (1995); Narod, S.A., et
al., Nature Med., 1:99-101 (1995); Steinberg, et al.,
Prostate, 27:33-47 (1990); Hayes, R.B., et al., Int. J.
Cancer, 60:361-4 (1995); Whittemore, A.S., et al., Am. J.
Epidemiol., 141:732-40 (1995)). Irvine et al. have
suggested that certain forms of the AR characterized by
their CAG repeats may be associated with prostate cancer
(Irvine, R.A., et al., Cancer Res., 55:1937-40 (1995)).

These observations led us to directly assess whether
polymorphism in CAG repeat length in the AR is related to
prostate cancer development and progression in the
Physician's Health Study.

D) or high histologic grade (RR = 0.61 (95 percent
confidence interval, 0.45-0.82; p = 0.001)). The relative
risk for an increment of six CAG repeats was 0.41 (95
percent confidence interval, 0.22-0.76; p = 0.004) for
distant metastatic prostate cancer and 0.48 (95 percent
confidence interval, 0.25-0.95; p = 0.04) for fatal
prostate cancer. Variability in the CAG repeat length was
not associated with low grade or low stage disease. Among
cases, an inverse correlation between CAG repeat length and
disease progression as indicated by stage and grade
(p = 0.001) was observed.
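
Because the relative risks are reported per six-repeat increment, it can help to restate them per single repeat. The Python sketch below assumes that risk is log-linear in CAG repeat length, as it would be if repeat length entered the logistic model as a continuous term (an assumption; the model specification is not given here). The function name is illustrative.

```python
import math


def rescale_relative_risk(relative_risk, from_increment, to_increment):
    """Rescale a relative risk between increment sizes under a log-linear model.

    If RR applies to 'from_increment' repeats, the implied log relative risk per
    repeat is ln(RR) / from_increment; multiplying by 'to_increment' and
    exponentiating gives the relative risk for the new increment.
    """
    log_rr_per_repeat = math.log(relative_risk) / from_increment
    return math.exp(log_rr_per_repeat * to_increment)


# Relative risk of high stage or high grade disease per six CAG repeats (from the text).
rr_per_repeat = rescale_relative_risk(0.61, from_increment=6, to_increment=1)

# Round trip back to a six-repeat increment as a sanity check.
rr_per_six = rescale_relative_risk(rr_per_repeat, from_increment=1, to_increment=6)
```

Under that assumption, a relative risk of 0.61 per six repeats corresponds to roughly 0.92 per additional repeat.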

Conclusions 

The results demonstrate that shorter androgen receptor
CAG repeat lengths predict higher grade and advanced stage
of prostate cancer at diagnosis, and metastasis and
mortality from the disease.

Cell division in the prostate gland is controlled by
testosterone (Coffey D.S., UICC Technical Report Series,
48:4-23, Geneva: International Union Against Cancer,
(1979)). In the prostate cell, testosterone is converted to
dihydrotestosterone (DHT) (Thigpen, A.E., et al., N.E. J.
Med., 327:1216-19 (1992)) which binds to the androgen
receptor (AR) in the cell nucleus, and the DHT-AR complex
then interacts with specific DNA sequences, modulating
target gene activity. Encoded in exon 1 of the AR gene are
polymorphic CAG repeats, which range normally from about 3
to about 31 and average about 20 (Edwards A., et al.
Genomics 12:241-53 (1992)). The CAG repeats encode
polyglutamine chains in the transcriptional activation
region of the AR. In transfection assays, the length of
these polyglutamine chains correlates inversely with
transcriptional activation by the AR (Chamberlain, N.L., et
al., Nucleic Acids Res., 22:3181-6 (1994); Kazemi-Esfarjani
P., et al., Human Molecular Genetics, 4:523-7 (1995)).

Selection of Cases and Controls

When a participant reported a diagnosis of cancer on
the follow-up questionnaires, medical records, including
pathology reports, were requested and reviewed by study
physicians from the End Points Committee. By 1995, we
confirmed 591 cases of prostate cancer among the 14,916 who
had provided blood. For each case, one control who had
provided blood, had not had a previous prostatectomy, and
had not reported a diagnosis of prostate cancer at the time
diagnosis was reported by the case was selected. Controls
were also matched on smoking status and age within one
year, except for several very elderly cases for whom age
had to be matched within two years. After 13 years of
follow-up, over 99% of the men were still reporting
morbidity events, and vital status was ascertained for
100%.

Medical Record Review

A study physician, unaware of assay results, reviewed
medical records for each case to determine stage at
diagnosis, tumor grade, and Gleason score. Stage was
recorded according to the modified Whitmore-Jewett
classification scheme (Beahrs, O.H., et al., Manual for
staging of cancer, 4th ed., Philadelphia: J.B. Lippincott
(1992)). If multiple tissue samples were examined, the
highest reported grade and Gleason score were recorded.
Cases without pathological staging were classified as
indeterminate stage unless there was clinical evidence of
distant metastases. High grade/stage cases were defined as
those diagnosed at stage C or D (extraprostatic) plus those
diagnosed at stage A or B or indeterminate with either poor
histologic differentiation or Gleason score 7 or greater.
Cases with clinical stage A or B or no pathological
Methods

Study population

The Physicians' Health Study was a randomized, double-blind
trial of aspirin and beta-carotene among 22,071 U.S.
male physicians, aged 40 to 84 years in 1982 (Steering
Committee of the Physicians' Health Study Res. Group, N.E.
J. Med., 321:129-35 (1989)). The cohort is predominantly
white (over 95%). Men were excluded if they reported a
prior history of myocardial infarction, stroke, transient
ischemic attacks, unstable angina, cancer (except for
non-melanoma skin cancer), current renal or liver disease,
peptic ulcer or gout, contraindication to use of aspirin,
or current use of aspirin, other platelet-active agents or
vitamin A supplements.

Study participants completed two mailed questionnaires
before randomization in 1982, and additional questionnaires
at six months, 12 months, and annually thereafter. Before
randomization, blood kits were sent to all participants
with instructions to have their blood drawn into vacutainer
tubes containing EDTA, to centrifuge them, and to return
the specimens (by overnight pre-paid courier) in
polypropylene cryopreservation vials. The kit included a
cold pack to keep the specimens cool until receipt the
following morning, when they were aliquoted and stored at
-82°C. Specimens were received from 14,916 (68%) of the
randomized physicians. The lack of blood samples for some
study participants is unlikely to have introduced selection
bias, since it is unlikely that physicians who did or did
not provide a sample would differ in terms of the
relationship of the AR CAG polymorphism to subsequent
prostate cancer experience.
Data Analysis

Analyses to determine whether AR CAG repeat length was
related to the development of prostate cancer were
conducted. Unlike the infiltrative or aggressive type of
prostate cancer, the frequency of the latent
non-infiltrative type of cancer varies very little among
populations (Yatani, R., et al., Int. J. Cancer, 29:611-6
(1982)), suggesting that factors that influence initiation
may differ from those that influence progression of
prostate cancer; hence, additional analyses of tumors with
a more aggressive phenotype as determined by histology
(tumor grade or Gleason score), tumor stage, and fatality
were conducted. The relative risk (estimated by the odds
ratio) of developing total, high-grade, advanced-stage,
distant metastatic, and fatal prostate cancer was examined.
Unconditional logistic regression was used, controlling for
age and smoking, the matching variables, to compute odds
ratios and 95 percent confidence intervals, after first
conducting conditional logistic regression to confirm
similar results. By using unconditional logistic
regression, it was possible to utilize information from all
controls in the analyses limited to a subset of cases
(e.g., high grade or fatal cases).

In addition, analyses within the cases only were
conducted to examine various parameters of aggressive
behavior (stage, grade, fatality) in relation to CAG repeat
length. Because AR transcriptional activation function
decreases linearly across the entire CAG spectrum
(Chamberlain, N.L., et al., Nucleic Acids Res., 22:3181-6
(1994); Kazemi-Esfarjani P., et al., Human Molecular
Genetics, 4:523-7 (1995)), CAG repeats were analyzed as a
continuous variable in logistic models. This approach
assumes that each one-unit increment in CAG repeat length
is related to a constant increase or decrease in the
natural logarithm of the odds ratio. In addition, men were
staging, and moderate or better histologic grade were
classified as low grade/stage.

Analysis for CAG repeat length in the androgen receptor

Since the AR gene is X-linked, only one copy of the
gene exists in men. The CAG repeat region resides in the
first exon of the gene. A system to rapidly analyze the
CAG repeat sequence length in a large number of samples was
established. Five hundred microliters of whole blood were
thawed from cases and controls and DNA was extracted
utilizing the Qiagen QIAamp Blood Kit. A set of
oligonucleotide primers that flank the CAG repeat
(5'TCCAGAATCTGTTCCAGAGCGTGC3' (SEQ ID NO: 1) and
5'GCTGTGAAGGTTGCTGTTCCTCAT3' (SEQ ID NO: 2)) were constructed.
The DNA was amplified using these primers by polymerase
chain reaction (PCR) to produce fragments of the N-terminal
domain of the AR. Primers were fluorescently labelled.
The length of these fragments varied only by the number of
CAG repeats. For rapid and accurate assessment of fragment
length, the DNA fragments were run on a 6% denaturing
polyacrylamide gel by automated fluorescence detection
(Genescan, Applied Biosystems) in the Dana Farber Cancer
Institute Molecular Biology Core Facility. Using a series
of sequenced PCR products of varying size, fluorescently
labelled DNA markers were used to create a standard curve
of peak arrival time that in turn was used to calculate the
length of an unknown PCR product automatically. Resolution
of 1 base pair using this system was confirmed with direct
DNA sequencing. The assays were conducted by laboratory
personnel blinded to case-control status. Multiple samples
were run per lane because of fluorescence labelling. Split
samples were used to ensure quality control. It was
possible to amplify the DNA for 587 of the 591 cases and
588 of the 591 controls (>99%).
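
The sizing step described above can be sketched as follows. This is a minimal illustration, not the Genescan software's actual algorithm: it assumes a straight-line standard curve, and the marker arrival times, marker sizes, and `FLANK_BP` (the length of the amplified fragment outside the CAG tract) are hypothetical values chosen only for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cag_repeats(arrival_time, curve, flank_bp):
    """Convert a peak arrival time into a whole number of CAG repeats."""
    slope, intercept = curve
    # Read the fragment length off the standard curve, to the nearest base pair.
    fragment_bp = round(slope * arrival_time + intercept)
    # Each CAG repeat contributes three bases beyond the flanking sequence.
    return round((fragment_bp - flank_bp) / 3)


# Hypothetical sequenced markers: peak arrival times and fragment lengths (bp).
MARKER_TIMES = [100, 110, 120, 130]
MARKER_SIZES = [250, 265, 280, 295]
FLANK_BP = 220  # assumed non-repeat portion of the PCR product

curve = fit_line(MARKER_TIMES, MARKER_SIZES)
sample_repeats = cag_repeats(122, curve, FLANK_BP)
print(sample_repeats)
```

A real standard curve need not be linear over the whole size range, so a production pipeline would more plausibly interpolate between neighboring markers; the straight-line fit is used here only to keep the sketch short.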
divided into categories of number of CAG repeats to observe
if non-monotonic increases existed across levels (e.g., if
a threshold effect existed). The categorization (ranging
from ≤19 to ≥26 repeats) was based on approximating
relatively equal numbers in the categories, although the
numbers fluctuated somewhat because of the very uneven
distribution of CAG repeats. Potential confounding by
alcohol consumption, multivitamin use, body mass index, and
exercise level on the 1982 questionnaire, and aspirin use
based on randomization was addressed by including these as
covariates in multivariate models. All p-values are
two-sided.
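
The log-linear assumption behind treating CAG repeats as a continuous variable can be illustrated with a short sketch. Under that model the odds ratio for any increment is the exponential of the per-repeat coefficient times the increment. The coefficient `BETA` below is an illustrative value, not an estimate from this study.

```python
import math


def odds_ratio_for_increment(beta_per_repeat, repeats):
    """Odds ratio implied by a log-linear model for a given increment."""
    return math.exp(beta_per_repeat * repeats)


def beta_from_odds_ratio(odds_ratio, repeats):
    """Per-repeat log odds recovered from an odds ratio for an increment."""
    return math.log(odds_ratio) / repeats


BETA = -0.04  # illustrative per-repeat coefficient, not a study estimate
or_one = odds_ratio_for_increment(BETA, 1)
or_six = odds_ratio_for_increment(BETA, 6)
print(f"OR per repeat: {or_one:.3f}; OR per six repeats: {or_six:.3f}")
```

Because the model is linear on the log scale, the six-repeat odds ratio is simply the one-repeat odds ratio raised to the sixth power, which is why a single coefficient suffices to report an odds ratio for an increment of six.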

Results

The number of CAG repeats in the AR ranged from 12 to
35 among cases and from 6 to 33 among controls. The mean
(and standard deviation) for CAG repeats was 21.5 (3.1) for
cases and 22.0 (3.3) for controls. The difference in means
was not statistically significant (P = 0.25). Among the
controls, the mode of the distribution occurred at 21 CAG
repeats (17% of men), approximately 10% of the men fell in
each of 22, 23, 24, and 26 repeats, and a sharp drop-off
occurred at 27 CAG repeats.

Next the relative risks (estimated by odds ratios) for
total and sub-groups of prostate cancers by CAG repeat
length were examined. For total prostate cancer, an
inverse association between CAG repeat size and risk of
disease (P = 0.04) (Table 7) was observed.
Table 8. Odds ratio of prostate cancer for a CAG microsatellite repeat length
increment of 6 in the androgen receptor gene among men in the
Physicians' Health Study

Prostate Cancer         Cases   Odds Ratio*            95% Confidence   P-value
                                (6 increment in CAG)   Interval

Total                   587     0.78                   (0.62-0.99)      0.04
High grade/stage**      269     0.61                   (0.45-0.82)      0.001
Low grade/stage         309     0.98                   (0.73-1.30)      0.86
High grade              210     0.63                   (0.45-0.88)      0.007
Advanced stage          180     0.57                   (0.40-0.81)      0.002
Metastatic (Distant)    56      0.41                   (0.22-0.76)      0.004
Fatal                   43      0.48                   (0.25-0.95)      0.04

* Odds ratio is calculated by modeling CAG as a continuous variable in an
unconditional logistic model and computing the odds ratio for a six CAG increment
(increment from median of low to median of high tertile of CAG repeat length).

** Includes tumors with Gleason grade ≥7 or high grade or advanced stage (C or D)
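
As a consistency check on Table 8, the P-values can be approximately recovered from the published odds ratios and 95 percent confidence intervals. The sketch below assumes a Wald-type interval that is symmetric on the log scale; this is an assumption about how the intervals were formed, and because the tabulated figures are rounded the result only approximates the reported P-values.

```python
import math


def wald_p_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate two-sided P-value from an odds ratio and its 95% CI."""
    # Standard error of log(OR), inferred from the width of the interval.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variate.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rows taken from Table 8 (six-repeat increment).
z_total, p_total = wald_p_from_ci(0.78, 0.62, 0.99)
z_high, p_high = wald_p_from_ci(0.61, 0.45, 0.82)
print(f"Total: z = {z_total:.2f}, p = {p_total:.3f}")
print(f"High grade/stage: z = {z_high:.2f}, p = {p_high:.4f}")
```

Under these assumptions the total and high grade/stage rows come out near the tabulated 0.04 and 0.001, which suggests the table entries are internally consistent.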
Statistically significant inverse associations for cancers
characterized by various indicators of high grade or
advanced disease were noted, whereas no association was
noted for low grade or low stage cancer. A strong and
statistically significant (P = 0.04) inverse association
between CAG repeat length and fatal prostate cancer was
observed. AR CAG length was not correlated with any
cofactor considered (age, year of diagnosis, alcohol
intake, physical activity, multivitamin use, body mass
index, and aspirin use); hence, the results were unaltered
when these were included as covariates in models. Table 8
shows the odds ratios for a six-CAG increment (equivalent to
the difference in median CAG repeat length between the high
and low tertiles). Also shown are results for high grade
and advanced stage lesions separately.
Men with low grade/stage tumors had a slightly higher CAG
repeat length than controls (22.18 versus 22.00), but this
difference was not statistically significant. Cases defined
by high grade or stage had lower mean CAG repeat lengths
than low grade/stage cases. These differences were
statistically significant for advanced cases (P = 0.005)
and high grade cases (P = 0.002), and for distant
metastatic or fatal cases (P = 0.00[illegible]), and for fatal cases
(P = 0.04). A test for trend in CAG repeat length across
3 levels of disease (non-aggressive disease, high grade or
regional disease (beyond the prostatic capsule), and
distant metastases) using progressive ordinal values in a
linear regression model was statistically significant
(P = 0.001). At the extreme range of CAG repeats, the
relationship between repeat length and aggressive phenotype
was particularly strong. Comparing men with repeat lengths
≤15 to those ≥30, the odds ratio for high grade/stage
versus low grade/stage prostate cancer was 30; although
only 24 men fell in this range (4% of the total), this
result was statistically significant (P = 0.005).

Tumors with high grade are more likely to be of advanced
stage, but even after excluding those with both high grade
and advanced stage, shorter CAG repeats were observed
independently for high grade (P = 0.03) and advanced stage
(P = 0.02) cases only. Thus, CAG repeat length was
independently related to both tumor grade and stage at
diagnosis.

Discussion

Cell division in the prostate gland is mediated through
androgens. Various lines of evidence suggest that the
occurrence and progression of malignancies of this gland
are influenced by androgen stimulation. Prostate cancer is
sensitive to androgens and often regresses when androgen
These relationships were initially observed
in 367 cases, mostly diagnosed by 1991, before the widespread
use of prostate-specific antigen (PSA) for screening.
Subsequently this association was confirmed in 220 new
cases diagnosed after March 1992, during the era of
prevalent use of PSA for screening. The combined 587 cases
comprise the cases described in this report. The relative
risks for high stage/grade lesions (for a CAG increment
of 6) were very similar in the initial analysis (RR = 0.66;
95 percent confidence interval = 0.45-0.96; P = 0.03) and
for cases during the subsequent time period (RR = 0.52;
95 percent confidence interval = 0.29-0.91; P = 0.02). No
appreciable association was observed for low grade/stage
cancers during either time period.

Next, AR CAG repeat length was examined in the
cases alone, assessing the different classes of tumors
(Table 9).

Table 9. Mean CAG length in the androgen receptor gene (± standard error
of the mean)

                                    n     Mean CAG (± SEM)   P-value

Low grade/stage Prostate Cancer     309   22.18 (±0.19)
High grade/stage Prostate Cancer    269   21.36 (±0.18)      0.002
Advanced Prostate Cancer            180   21.36 (±0.22)      0.005
High-grade Prostate Cancer          210   21.42 (±0.20)      0.007
Metastatic Prostate Cancer          56    20.89 (±0.38)      0.006
Fatal Prostate Cancer               43    21.05 (±0.46)      0.03

P-value based on t-test for difference versus mean
androgen receptor gene CAG length among low grade/stage
prostate cancer cases.
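
The comparison in Table 9 can be approximated from the tabulated means and standard errors alone. The sketch below uses a large-sample normal approximation for the difference between two independent group means; it is not the exact t-test used for the table, so it should only be expected to land near the reported P-values.

```python
import math


def compare_means(mean_a, sem_a, mean_b, sem_b):
    """Approximate two-sided test for a difference between two group means."""
    diff = mean_a - mean_b
    # Standard error of the difference for independent groups.
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p


# Low grade/stage versus high grade/stage cases, values from Table 9.
diff, z, p = compare_means(22.18, 0.19, 21.36, 0.18)
print(f"difference = {diff:.2f} repeats, z = {z:.2f}, p = {p:.4f}")
```

For the low versus high grade/stage comparison this gives a difference of about 0.8 repeats and a P-value close to the tabulated 0.002. For the smaller metastatic and fatal groups the normal approximation is cruder and can be expected to differ more from the t-test.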
Given clear evidence of clinical androgen insensitivity
with long CAG repeat lengths and the linear gradient
between CAG repeat length and AR transcriptional activation
in vitro, a reasonable supposition is that variation within
the normal range is associated with differences in
transcriptional activation, albeit modest, in vivo. Based
on the assumption that androgens are critical to prostate
cancer development or progression, Coetzee and Ross
(Coetzee, G.A., Ross, R.K., J. Natl. Cancer Inst., 86:872-3
(1994)) had hypothesized that variation in
transactivational activity by the AR, related to
polymorphic CAG repeats, influences prostate
carcinogenesis. Also of potential relevance, a somatic
mutation resulting in a contraction of the CAG
microsatellite (CAG 24→18) was observed in an adenocarcinoma
of the prostate (Schoenberg, M.P., et al., Biochem.
Biophys. Res. Comm., 198:74-80 (1994)), although whether
this contraction was involved in the development or
progression of the tumor or is an epiphenomenon is unknown.

The hypothesis that polymorphism in the CAG repeat, which
influences transcriptional activation function of the AR, is
related to prostate cancer development was examined. This
hypothesis was tested in a large, prospective study, and it
was found that variability in the CAG repeats of the AR was
associated with prostate cancer and was particularly
closely related to an aggressive phenotype, as defined by
high histological grade, extension through the prostate
gland, presence of distant metastasis at diagnosis, and
mortality from the disease. A highly significant
association occurred independently for both tumor grade and
stage, increased in magnitude with degree of aggressive
behavior, such as distant metastases and mortality, and
occurred consistently over time in this cohort, arguing
strongly that this was not a chance finding.
Raven Series, 43:4-23, Geneva: International Union Against
Cancer, (1979). Malignancies of the prostate occur rarely
in castrated men (Hovenian, M.S. and Deming, C.L., Surg.
Gynecol. Obstet., 86:29-35 (1948)), and the prolonged
administration of high levels of testosterone has induced
prostate cancer in rats (Noble, R.L., Cancer Res., 37:1929-
33 (1977); Pollard, M., et al., Prostate, 4:563-6 (1982)).
While abnormally low levels of androgens are associated
with low risk of the disease and high levels induce cancer
in animals, the question whether androgenicity within the
normal range is associated with moderate differences in
risk is unsettled.

The action of androgens is ultimately mediated through
the androgen receptor (AR). In transfection experiments,
longer AR polyglutamine repeat lengths encoded by CAG
repeats are associated with lower transcriptional
activation function. Two laboratories (Chamberlain, N.L.,
et al., Nucleic Acids Res., 22:3181-6 (1994);
Kazemi-Esfarjani P., et al., Human Molecular Genetics, 4:523-7
(1995)) have independently established that this
relationship is length-dependent, and occurs even within
the normal range of CAG repeats. In contrast to binding of
the AR to DNA, binding of androgens occurs in a different
region of the AR which is unaffected by this polymorphism
in polyglutamine length. Abnormally high CAG repeat length
(≥40), which through an unknown mechanism causes
spinobulbar muscular atrophy or Kennedy syndrome, is
associated with clinically overt androgen insensitivity in
men (LaSpada, A.R., et al., Nature, 352:77-9 (1991);
Igarashi, S., et al., Neurology, 42:2300-2 (1992)). Based
on a small sample (n=16), women with normal testosterone
levels but with idiopathic hirsutism exhibited an inverse
correlation between degree of hirsutism and CAG repeat size
within the normal range (r = 0.60, P = 0.01) (Legro, R.S.,
et al., Obstet. Gynecol., 83:701-6 (1994)).
attributable risk may be quite high. For example, it is
estimated herein that among men in the lowest tertile of
CAG repeat length, over half of the metastatic cancers are
attributable to the relatively short CAG repeat length.
African-American men have on average higher PSA values,
about a two-fold higher rate of metastatic prostate disease
and mortality, and larger tumor volumes, even when they
have equal access to health care as whites (Brawn, P.N., et
al., Cancer, 71:2569-73 (1993); Moul, J.W., et al., JAMA,
274:1277-81 (1995)). Although the similar access to care
does not assure equivalent utilization, these data are
strongly indicative of a more aggressive tumor biology
among blacks. Black men tend to have on average
considerably shorter AR CAG repeats than white men in the
U.S.; for example, about 7% of white men have repeat
lengths less than 19 as compared to 40% of black men
(Edwards A., et al., Genomics 12:241-53 (1992)). U.S. black
men also have higher levels of bioavailable testosterone
than whites (Ross, R.K., et al., J. Natl. Cancer Inst.,
76:45-8 (1986)) and appear exposed to higher levels of
estrogen and testosterone in utero (Henderson, B.E., et
al., Br. J. Cancer, 57:216-18 (1988)). Both hormonal
levels and the AR responsivity may contribute to higher
rates of prostate cancer mortality among African-Americans.

The results herein provide strong evidence that the
variability in the transcriptional activation function of
the AR is associated with the risk of developing prostate
cancer and in particular aggressive prostate cancer. These
data represent the first known germline polymorphism
related to tumor promotion or progression in "sporadic"
tumors. Moreover, these findings help explain the higher
rate of prostate cancer mortality among black men, the
tendency for blacks to be diagnosed with more extensive
disease, and the apparent X-linked component to prostate
cancer risk. Our results are consistent with a substantial
Based on the study by Kazemi-Esfarjani et al.
(Kazemi-Esfarjani et al., Human Molecular Genetics, 4:523-7
(1995)), it was estimated that each additional
polyglutamine repeat would produce approximately a
2 percent decrease in transcriptional activation function
by the AR. Thus, a 12 percent differential in
transcriptional activation is predicted for an increment of
6 CAG repeats. Although the magnitude of the effect of
AR polyglutamine length on transcriptional activation
function in vitro might appear relatively modest, these
differences over a lifetime might have a substantial
impact. Using a mathematical model which assumes that
prostate cancer risk is directly proportional to cumulative
mitotic activity, Ross et al. have estimated that a 13%
difference in testosterone-stimulated mitotic activity
would result in a 2.3-fold difference in prostate cancer
incidence (Ross, R.K., Accomplishments in cancer research,
219-26, (19[illegible]2)). For a decrement of 6 CAG repeats or about
12% difference in transcriptional activation, the data
herein predict a RR of 2.4 for metastatic disease and 2.0
for fatal disease, which are well within the magnitude
predicted by the model. These results also suggest that
androgen stimulation within normal limits is a critical
determinant of prostate cancer risk.
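
The arithmetic in this paragraph can be laid out explicitly. The sketch below takes the approximate 2 percent decrease per repeat and the six-repeat odds ratios of about 0.41 (metastatic) and 0.48 (fatal) reported herein, and assumes, as the logistic model does, that the relative risk for a six-repeat decrement is the reciprocal of the odds ratio for a six-repeat increment. The variable and function names are illustrative.

```python
PER_REPEAT_DECREASE = 0.02  # approximate in vitro decrease per added repeat
INCREMENT = 6               # number of CAG repeats in the comparison

# Simple additive estimate of the activation differential, as in the text.
activation_differential = PER_REPEAT_DECREASE * INCREMENT


def relative_risk_for_decrement(odds_ratio_for_increment):
    """Invert an odds ratio for an increment to obtain it for a decrement."""
    return 1.0 / odds_ratio_for_increment


rr_metastatic = relative_risk_for_decrement(0.41)
rr_fatal = relative_risk_for_decrement(0.48)
print(f"activation differential: {activation_differential:.0%}")
print(f"RR metastatic: {rr_metastatic:.2f}; RR fatal: {rr_fatal:.2f}")
```

The reciprocals come out at roughly 2.4 and 2.1, close to the relative risks of 2.4 and 2.0 quoted above; the small gap for fatal disease is plausibly due to rounding of the published odds ratio.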

Most known germline mutations that confer higher risk of
cancer (e.g., BRCA1 in breast, mismatch repair genes in colon,
putative prostate cancer suppressor gene) are characterized
by early age of disease onset, high population attributable
risk at young ages, but a relatively low population attributable
risk due to the sharply increasing incidence of "sporadic"
cancers that occurs with advancing age. In contrast, the
pattern characterized by the AR CAG polymorphism is that a
moderate gradient of risk occurs across the spectrum of CAG
repeats. Because this polymorphism influences the
progression of "sporadic" cancers, the population
(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/649,0[illegible]9
effect of CAG repeat length. Polymorphisms in the AR CAG
lengths have implications regarding prevention, screening,
and treatment for prostate cancer.

EQUIVALENTS

Those skilled in the art will know, or be able to
ascertain, using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. These and all other equivalents are
intended to be encompassed by the following claims.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GCTGTGAAGG TTGCTGTTCC TCAT

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCTGATGAAA ACTGTCAAGC TGCTGA

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GCCAGCTGGC AGAACGCCAG GAGAC
(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/556,217
(B) FILING DATE: 09-NOV-1995
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Brook, David E.
(B) REGISTRATION NUMBER: 22,592
(C) REFERENCE/DOCKET NUMBER: DFCI-461A PCT

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (617) 861-6240

(B) TELEFAX : (617) 861-S540 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TCCAGAATCT GTTCCAGAGC GTGC

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)






WO 97/17469



PCT/US96/17789 



CLAIMS

1. A method for prognosis of prostate cancer in a
male comprising: (a) determining the length of
the CAG trinucleotide repeat of exon 1 of the
androgen receptor gene and/or the length of the
TA dinucleotide repeat of the 5 alpha reductase
Type II gene obtained from DNA of the male and
(b) correlating the length of the repeat with the
aggressiveness and mortality risk of the cancer
in the male.

2. The method of Claim 1 wherein the DNA is genomic
DNA.

3. The method of Claim 2 wherein the DNA is obtained
from non-cancerous cells.

4. The method of Claim 3 wherein the DNA is obtained
from a tissue or blood sample.

5. The method of Claim 4 wherein the length of the
repeat is determined by PCR.

6. The method of Claim 4 wherein the aggressiveness
and mortality risk in the male occurs at the age
of at least about 60 years.

7. The method of Claim 6 wherein the male is at
least about 60 years of age.

8. The method of Claim 6 wherein the male is less
than about 60 years of age.
9. The method of Claim 4 wherein the male is African-American,
Caucasian or Asian.

10. The method of Claim 9 wherein the length of the repeat
is compared with the length of the repeat in males of
the same race as the male.

11. A method for determining length of a CAG
trinucleotide repeat in exon 1 of the androgenic
receptor gene or its complement in a male
comprising:

(a) obtaining DNA from the male wherein the
DNA comprises the CAG trinucleotide
repeat of exon 1 of the androgenic
receptor gene and/or the length of
the TA dinucleotide repeat of the 5
alpha reductase Type II gene or its
complement; and

(b) determining length of the repeat; and

(c) comparing the length of the repeat with the
length of the repeat in a male population
individuals;

wherein the length of the repeat is prognostic of
the aggressiveness and mortality of the prostate
cancer.



